Academic Search Engine Spam and Google Scholar’s Resilience Against it
Abstract
In a previous paper we provided guidelines for scholars on optimizing research articles for academic search engines such as Google Scholar. Feedback from the academic community on these guidelines was diverse. Some were concerned that researchers could use our guidelines to manipulate rankings of scientific articles and promote what we call ‘academic search engine spam’. To find out whether these concerns are justified, we conducted several tests on Google Scholar. The results show that academic search engine spam is indeed possible, and with little effort: we increased rankings of academic articles on Google Scholar by manipulating their citation counts; Google Scholar indexed invisible text we added to some articles, making papers appear for keyword searches the articles were not relevant for; Google Scholar indexed some nonsensical articles we randomly created with the paper generator SciGen; and Google Scholar linked to manipulated versions of research papers that contained a Viagra advertisement. At the end of this paper, we discuss whether academic search engine spam could become a serious threat to Web-based academic search engines.
Similar Resources
Google Scholar’s Ranking Algorithm: An Introductory Overview
Google Scholar is one of the major academic search engines, but its ranking algorithm for academic articles is unknown. We performed the first steps toward reverse-engineering Google Scholar’s ranking algorithm and present the results in this research-in-progress paper. The results are: citation count is the most heavily weighted factor in Google Scholar’s ranking algorithm. Therefore, highly cited artic...
Analysis of Web Spam for Non-English Content: Toward More Effective Language-Based Classifiers
Web spammers aim to obtain higher ranks for their web pages by including spam contents that deceive search engines in order to include their pages in search results even when they are not related to the search terms. Search engines continue to develop new web spam detection mechanisms, but spammers also aim to improve their tools to evade detection. In this study, we first explore the effect of...
Putting Google Scholar to the test: a preliminary study
Purpose – To describe a small-scale quantitative evaluation of the scholarly information search engine, Google Scholar. Design/methodology/approach – Google Scholar’s ability to retrieve scholarly information was compared to that of three popular search engines: Ask.com, Google and Yahoo! Test queries were presented to all four search engines and the following measures were used to compare them...
An exploratory study of Google Scholar
Purpose – The purpose of this paper is to discuss the new scientific search service Google Scholar (GS). It aims to discuss this search engine, which is intended exclusively for searching scholarly documents, and then empirically test its most important functionality. The focus is on an exploratory study which investigates the coverage of scientific serials in GS. Design/methodology/approach – ...
Using Spam Farm to Boost PageRank
Today people have become more and more dependent on search engines such as Google, Yahoo, and MSN for their information needs. Web spamming has emerged to take economic advantage of high search rankings and has threatened the accuracy and fairness of those rankings. Understanding spamming techniques is essential for evaluating the strengths and weaknesses of a ranking algorithm, and for fig...